Introduction

The garment industry is responsible for approximately 4% of global emissions, most of it from the production process, and is extremely resource-intensive (1). In response, a system of voluntary sustainability standards (VSS) has emerged: companies pledge to meet certain environmental standards, such as specific standards for sourcing organic cotton. However, these third-party standards are actors in their own right, often managed by influential non-profits and international organizations. The dynamics between standard-setters and apparel companies have not yet been broadly explored. This blog post introduces a proof of concept for the quantitative aspect of my master’s thesis, which will explore the dynamics between VSS and garment companies.

Building the Dataset

The network that I am studying is a two-mode, bipartite network in which companies and VSS form distinct node sets and edges form between the two sets, but not within them. For the purposes of this class assignment, I needed to limit my data collection, so I restricted the node set on the company side, selecting a subset of 11 apparel companies and then developing an edgelist based on the VSS they have adopted. I based my subset on the McKinsey Global Fashion Index (MGFI), a list of the top apparel companies by average economic profit covering 2019 and 2020 (2).
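As a sketch of how such an edgelist becomes a network object in R (using the statnet `network` package; the company and standard names below are toy placeholders, not my actual data):

```r
library(network)

# Toy stand-in for the real company–VSS edgelist
el <- data.frame(
  company = c("Kering", "Kering", "Inditex"),
  vss     = c("FSC", "ZDHC", "FSC")
)

# Integer ids: companies first (mode 1), then VSS (mode 2)
companies <- unique(el$company)
standards <- unique(el$vss)
ids <- cbind(match(el$company, companies),
             length(companies) + match(el$vss, standards))

# 'bipartite' gives the size of the first (company) node set
net <- network(ids, bipartite = length(companies), directed = FALSE)
network.vertex.names(net) <- c(companies, standards)
```

The `bipartite` argument is what tells statnet to treat the first block of vertices as one mode, which later ERGM terms such as `b1cov()` rely on.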

I want to note that the top companies by revenue overlap substantially with the top companies by profit, but the two lists are not identical, so my list excludes actors like H&M that did not rank on the MGFI. Furthermore, the MGFI covers publicly-held companies, whose information is more readily available than that of relatively opaque privately-held companies such as Shein (3). However, these “ultra-fast-fashion” companies are quickly growing in popularity. Future iterations of my dataset will need to account for both revenue and profit, potentially testing whether profitability makes a difference in which standards are adopted, and will need to accommodate privately-held brands with limited publicly-accessible data.

Once I selected my 11 companies, I went through all of their most recent ESG reports, annual reports, and press releases and developed my edgelist based on which VSS they publicly disclosed adopting (4). Part of my broader hypothesis is that brands adopt VSS in response to consumer and shareholder pressure, which motivates the public disclosure of VSS adoption and, in turn, facilitated my data gathering. I also cross-checked the websites of the VSS themselves to ensure the most accurate data collection possible (5). Obviously, this form of data gathering is prone to mistakes. For instance, because I am interested in environmental standards, I excluded labor standards, but certain labor standards may also have environmental components.

The types of products sold also impact VSS adoption, and this is not accounted for in my dataset. A company like Lululemon that does not sell products incorporating gold will naturally not adopt gold-related VSS.

Dataset Summary

At the end of my data gathering, I had built a bipartite network with 11 company nodes carrying the following nodal attributes:

  • 2019/2020 profitability in USD

  • luxury company (y/n)

  • stock index

  • country of legal domicile

  • Sustainalytics ranking, a sustainability scoring index (9)

The VSS node set includes 66 distinct standards with three nodal attributes:

  • VSS area (“Materials”, “Water”, “Reporting”, “Chemistry”, “Emissions”, “Energy”, “General”, “Waste”, “Forestry”)

  • country of legal domicile

  • whether the standard is listed by Textile Exchange

There are 181 edges between the two node sets.

Visualizing the Network

As we can see from the first visualization, Kering appears to be the company with the highest degree, i.e., the most VSS adopted. Furthermore, the majority of VSS are materials-related.

Looking at degrees, Kering is confirmed to be the company with the highest number of out-ties, at 44. The UN Global Compact, FSC, ZDHC, the Fashion Industry Charter, and the Leather Working Group are tied for the highest number of in-ties among VSS, with 7 each.

A traditional bipartite layout makes it easier to read the names of the relevant standards and again makes it very clear that materials-related standards are the most popular.

ERGM Analysis

I chose to analyze my network using ERGMs because I want to understand more about its structure. I hope to learn whether certain nodal attributes, such as being a luxury company, make a company more likely to adopt a standard, a question suggested by Kering’s strong degree centrality in the network. I also hope to learn whether companies influence each other’s adoptions by examining which ties they do and do not share. In particular, I was influenced by the bipartite ERGM tutorial by Kuvelkar and Hunter and a chapter by Wang (6, 7). At some point, I would like to fit a time-based model, but that will require more targeted and intensive data collection to establish when each brand adopted a specific VSS.

Nodal Covariates

I am specifically interested in knowing whether nodal attributes are correlated with a higher probability of a tie forming. For this proof of concept, I will look at whether a company’s classification as a luxury company influences the probability that it adopts a VSS.
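The model behind the output below is `ergm(testnet ~ edges + b1cov("luxury"))`. As a self-contained sketch of that kind of fit on a toy network (the incidence matrix and attribute values are made up for illustration):

```r
library(ergm)

# Toy 3-company x 4-standard incidence matrix (rows = companies)
m <- matrix(c(1, 0, 1, 0,
              1, 1, 1, 1,
              0, 0, 1, 0), nrow = 3, byrow = TRUE)
toy <- network(m, matrix.type = "bipartite", directed = FALSE)

# Luxury indicator on the company mode; b1cov() only reads mode-1 values
toy %v% "luxury" <- c(1, 1, 0, rep(0, 4))

# Edges term (density baseline) plus the first-mode luxury covariate
fit <- ergm(toy ~ edges + b1cov("luxury"))
summary(fit)
```

Because both terms are dyad-independent, this model is estimated directly rather than by MCMC, which is why the real output below stops at the MPLE.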

## Starting maximum pseudolikelihood estimation (MPLE):
## Evaluating the predictor and response matrix.
## Maximizing the pseudolikelihood.
## Finished MPLE.
## Stopping at the initial estimate.
## Evaluating log-likelihood at the estimate.
## 
## Call:
## ergm(formula = testnet ~ edges + b1cov("luxury"))
## 
## Maximum Likelihood Coefficients:
##        edges  b1cov.luxury  
##       -1.377         1.025
## Call:
## ergm(formula = testnet ~ edges + b1cov("luxury"))
## 
## Maximum Likelihood Results:
## 
##              Estimate Std. Error MCMC % z value Pr(>|z|)    
## edges         -1.3770     0.1077      0  -12.79   <1e-04 ***
## b1cov.luxury   1.0251     0.1792      0    5.72   <1e-04 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##      Null Deviance: 1021.7  on 737  degrees of freedom
##  Residual Deviance:  811.2  on 735  degrees of freedom
##  
## AIC: 815.2  BIC: 824.4  (Smaller is better. MC Std. Err. = 0)

I find that there is a statistically significant luxury effect on the probability of a tie, with a coefficient of 1.025. However, goodness-of-fit diagnostics indicate that the model underestimates low degrees.
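To put the estimates above on the probability scale (my own back-of-the-envelope check in base R):

```r
# Tie probability for a non-luxury company: inverse logit of the edges term
plogis(-1.377)           # roughly 0.20

# Tie probability for a luxury company: inverse logit of edges + luxury
plogis(-1.377 + 1.025)   # roughly 0.41
```

In other words, the model implies that a given standard is roughly twice as likely to be adopted by a luxury company as by a non-luxury one.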

4-Cycle

The 4-cycle assumption for bipartite networks fits within my research questions. If Kering adopts Standard A and Inditex adopts Standards A and B, is Kering more likely to join Standard B because both Kering and Inditex are members of Standard A? In other words, this starts to get at the question of how business relationships encourage the adoption of VSS.
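The `cycle(4)` statistic counts exactly these closed company–standard–company–standard loops. A minimal check on a toy network (the names in the comments are illustrative):

```r
library(ergm)

# Kering and Inditex both adopt Standards A and B, closing a
# company–A–company–B loop
m <- matrix(c(1, 1,    # Kering:  A, B
              1, 1),   # Inditex: A, B
            nrow = 2, byrow = TRUE)
toy <- network(m, matrix.type = "bipartite", directed = FALSE)

# Observed count of 4-cycles in the toy network
summary(toy ~ cycle(4))
```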

## Starting maximum pseudolikelihood estimation (MPLE):
## Evaluating the predictor and response matrix.
## Maximizing the pseudolikelihood.
## Finished MPLE.
## Starting Monte Carlo maximum likelihood estimation (MCMLE):
## Iteration 1 of at most 60:
## Optimizing with step length 0.5407.
## The log-likelihood improved by 1.8072.
## Estimating equations are not within tolerance region.
## Iteration 2 of at most 60:
## Optimizing with step length 0.7083.
## The log-likelihood improved by 1.8815.
## Estimating equations are not within tolerance region.
## Iteration 3 of at most 60:
## Optimizing with step length 1.0000.
## The log-likelihood improved by 1.0650.
## Estimating equations are not within tolerance region.
## Iteration 4 of at most 60:
## Optimizing with step length 1.0000.
## The log-likelihood improved by 0.0433.
## Convergence test p-value: 0.0001. Converged with 99% confidence.
## Finished MCMLE.
## Evaluating log-likelihood at the estimate. Fitting the dyad-independent submodel...
## Bridging between the dyad-independent submodel and the full model...
## Setting up bridge sampling...
## Using 16 bridges: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 .
## Bridging finished.
## This model was fit using MCMC.  To examine model diagnostics and check
## for degeneracy, use the mcmc.diagnostics() function.
## 
## Call:
## ergm(formula = testnet ~ cycle(4))
## 
## Last MCMC sample of size 241 based on:
##   cycle4  
## -0.02339  
## 
## Monte Carlo Maximum Likelihood Coefficients:
##   cycle4  
## -0.02458
## Call:
## ergm(formula = testnet ~ cycle(4))
## 
## Monte Carlo Maximum Likelihood Results:
## 
##         Estimate Std. Error MCMC % z value Pr(>|z|)    
## cycle4 -0.024575   0.004039      0  -6.084   <1e-04 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##      Null Deviance: 1021.7  on 737  degrees of freedom
##  Residual Deviance:  948.4  on 736  degrees of freedom
##  
## AIC: 950.4  BIC: 955  (Smaller is better. MC Std. Err. = 1.168)

First, my results are significant. However, the 4-cycle term has a negative coefficient of -0.0246. I can reject the null hypothesis that there is no effect, but the direction of the effect does not fit my research question. Perhaps companies like to use VSS to distinguish themselves from their competitors and are thus less likely to adopt a VSS that a company with which they already “share” a standard has adopted.
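Interpreted on the odds scale (a quick base-R check of the estimate above):

```r
# Each additional 4-cycle that a prospective tie would close
# multiplies the odds of that tie forming by exp(coefficient)
exp(-0.024575)   # roughly 0.976, i.e., about a 2.4% reduction per cycle
```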

## Sample statistics summary:
## 
## Iterations = 8192:131072
## Thinning interval = 512 
## Number of chains = 1 
## Sample size per chain = 241 
## 
## 1. Empirical mean and standard deviation for each variable,
##    plus standard error of the mean:
## 
##           Mean             SD       Naive SE Time-series SE 
##          73.07         248.72          16.02          17.90 
## 
## 2. Quantiles for each variable:
## 
##  2.5%   25%   50%   75% 97.5% 
##  -356  -115    67   230   607 
## 
## 
## Are sample statistics significantly different from observed?
##                  cycle4 Overall (Chi^2)
## diff.      7.307054e+01              NA
## test stat. 4.082964e+00    1.667059e+01
## P-val.     4.446494e-05    6.519487e-05
## 
## Sample statistics cross-correlations:
##        cycle4
## cycle4      1
## 
## Sample statistics auto-correlation:
## Chain 1 
##               cycle4
## Lag 0     1.00000000
## Lag 512   0.10817332
## Lag 1024 -0.06118608
## Lag 1536 -0.06542343
## Lag 2048  0.01968681
## Lag 2560  0.01322173
## 
## Sample statistics burn-in diagnostic (Geweke):
## Chain 1 
## 
## Fraction in 1st window = 0.1
## Fraction in 2nd window = 0.5 
## 
##  cycle4 
## -0.1117 
## 
## Individual P-values (lower = worse):
##    cycle4 
## 0.9110289 
## Joint P-value (lower = worse):  0.9415208 .

## 
## MCMC diagnostics shown here are from the last round of simulation, prior to computation of final parameter estimates. Because the final estimates are refinements of those used for this simulation run, these diagnostics may understate model performance. To directly assess the performance of the final model on in-model statistics, please use the GOF command: gof(ergmFitObject, GOF=~model).

MCMC diagnostics do not indicate large levels of degeneracy. However, goodness-of-fit diagnostics once again indicate particular issues with the degree distribution, with the model underestimating low degrees.

Attempts to adapt the model to better estimate low degrees were unsuccessful: either the model would not converge, or the additional specifications had the same problem. The consistency of this issue makes me think that it is rooted in my data. To start, this is a relatively small network, which can make it harder to achieve a good fit. Second, I was unable to consider multiple nodal attributes, a specification that could help with fit.

Conclusion

This exercise indicated several important avenues for future research. First, I need to widen my dataset, considering how to compare companies by profit and revenue and how to include private retailers with little publicly available data. Second, I recorded nodal attributes that I did not incorporate into my analysis, and doing so will broaden my findings. Third, this quantitative exercise needs to be accompanied by qualitative research that can help explain some of the trends I observe.